Automatic Evaluation of Summaries Using N-gram Co-occurrence Statistics

Authors

  • Chin-Yew Lin
  • Eduard H. Hovy
Abstract

Following the recent adoption by the machine translation community of automatic evaluation using the BLEU/NIST scoring process, we conduct an in-depth study of a similar idea for evaluating summaries. The results show that automatic evaluation using unigram co-occurrences between summary pairs correlates surprisingly well with human evaluations, based on various statistical metrics, while direct application of the BLEU evaluation procedure does not always give good results.
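The unigram co-occurrence idea behind the study can be sketched as a clipped-overlap recall score between a candidate summary and a reference. This is a minimal illustration only, not the authors' exact procedure: whitespace tokenization, lowercasing, and a single reference are simplifying assumptions.

```python
from collections import Counter

def unigram_cooccurrence_recall(candidate: str, reference: str) -> float:
    """Fraction of reference unigrams also found in the candidate.

    Matches are clipped: a candidate word counts at most as many times
    as it appears in the reference. Tokenization here is plain
    whitespace splitting, an assumption for illustration.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    if not ref:
        return 0.0
    # Clipped overlap: min of candidate and reference counts per word.
    overlap = sum(min(cand[w], ref[w]) for w in ref)
    return overlap / sum(ref.values())

score = unigram_cooccurrence_recall("the cat sat", "the cat sat on the mat")
```

Recall-oriented scoring of this kind (rather than BLEU's precision with a brevity penalty) is what later became the ROUGE family of summary metrics.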

Related papers

A Unified Framework For Automatic Evaluation Using 4-Gram Co-occurrence Statistics

In this paper we propose a unified framework for automatic evaluation of NLP applications using N-gram co-occurrence statistics. The automatic evaluation metrics proposed to date for Machine Translation and Automatic Summarization are particular instances from the family of metrics we propose. We show that different members of the same family of metrics explain best the variations obtained with...
Automatic summary assessment for intelligent tutoring systems

Summary writing is an important part of many English language examinations. As grading students' summaries is a very time-consuming task, computer-assisted assessment can help teachers carry out the grading more effectively. Several techniques, such as Latent Semantic Analysis (LSA), n-gram co-occurrence, and BLEU, have been proposed to support automatic evaluation of summaries. However, t...
Automatic Evaluation of Machine Translation Quality Using N-gram Co-Occurrence Statistics

Evaluation is recognized as an extremely helpful forcing function in Human Language Technology R&D. Unfortunately, evaluation has not been a very powerful tool in machine translation (MT) research because it requires human judgments and is thus expensive and time-consuming and not easily factored into the MT research agenda. However, at the July 2001 TIDES PI meeting in Philadelphia, IBM descri...

Manual And Automatic Evaluation Of Summaries

In this paper we discuss manual and automatic evaluations of summaries using data from the Document Understanding Conference 2001 (DUC-2001). We first show the instability of the manual evaluation. Specifically, the low interhuman agreement indicates that more reference summaries are needed. To investigate the feasibility of automated summary evaluation based on the recent BLEU method from mach...

Publication date: 2003